AlphaTims tutorial

This brief tutorial introduces the basic usage of the AlphaTims API in Python. It is divided into five parts:

  1. First things first
  2. Reading and saving data
  3. Slicing
  4. Plotting
  5. Advanced DIA example

First things first

If AlphaTims is not installed yet, install it with pip install alphatims (NOTE: it is highly recommended to do this in a clean conda environment).

Import the AlphaTims utils library with import alphatims.utils before using it (NOTE: this may take a few seconds).

By default, AlphaTims uses all but one of the available CPU threads and writes a log file to the log folder in the AlphaTims root directory.

Both options can be updated with the commands:
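A minimal sketch of how both options might be changed, assuming that alphatims.utils provides set_threads and set_logger with the keyword arguments shown (worth confirming with help() for your installed version). The wrapper name configure_alphatims and the default log file name are hypothetical.

```python
def configure_alphatims(n_threads=4, log_file_name="alphatims_log.txt"):
    """Set the AlphaTims thread count and log file (hypothetical helper)."""
    # Imported lazily because loading AlphaTims can take a few seconds.
    import alphatims.utils

    # Assumption: set_threads returns the number of threads that will be used.
    threads = alphatims.utils.set_threads(n_threads)
    # Assumption: set_logger accepts these keywords and returns the log path.
    log_path = alphatims.utils.set_logger(
        log_file_name=log_file_name,
        overwrite=True,
    )
    return threads, log_path
```

Calling configure_alphatims(2, "my_log.txt") would then limit AlphaTims to two threads and redirect the log to my_log.txt.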

To ensure full reproducibility, it is best to always log the details of your AlphaTims installation and OS:
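As a sketch, the logging step could look as follows. show_platform_info is the function mentioned elsewhere in this tutorial; show_python_info is assumed to exist alongside it, and the wrapper name log_installation_details is hypothetical.

```python
def log_installation_details():
    """Log OS and Python environment details (hypothetical helper)."""
    import alphatims.utils

    # Reports details of the operating system and hardware, such as RAM.
    alphatims.utils.show_platform_info()
    # Assumption: reports the versions of Python and the installed packages.
    alphatims.utils.show_python_info()
```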

Now you can load the other AlphaTims libraries. When an editable version is installed with the command (pip install -e alphatims), libraries can interactively be reloaded with the following command when the source code is modified:
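Reloading relies on the standard importlib module. The sketch below wraps it in a hypothetical helper, reload_modules, and only shows the AlphaTims calls as comments so that the block runs without AlphaTims installed.

```python
import importlib


def reload_modules(*modules):
    """Reload already-imported modules so that source edits take effect."""
    return [importlib.reload(module) for module in modules]


# With an editable install, this could be used as follows (not executed here):
#   import alphatims.utils
#   import alphatims.bruker
#   reload_modules(alphatims.utils, alphatims.bruker)
```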

Note that AlphaTims is fully documented and that help functions are available, e.g.:

Reading and saving data

A complete Bruker .d folder (available for download on AlphaTims' GitHub page) can be read into memory with the following command:
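A minimal sketch of the loading step, assuming that the TimsTOF class in alphatims.bruker takes the folder name as its first argument. The helper name load_timstof_data is hypothetical, and the path has to point to your own download.

```python
def load_timstof_data(bruker_d_folder_name):
    """Read a complete Bruker .d folder into memory (hypothetical helper)."""
    # Imported lazily because loading AlphaTims can take a few seconds.
    import alphatims.bruker

    # Assumption: the constructor reads all raw data of the given folder.
    return alphatims.bruker.TimsTOF(bruker_d_folder_name)
```

For example, data = load_timstof_data("/path/to/sample.d") would return the data object used in the remainder of this tutorial.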

IMPORTANT NOTE: the available RAM should be roughly twice the size of the Bruker .d folder. This is easily checked with the output of alphatims.utils.show_platform_info().

In our tutorial, we used macOS. On our very first command, import alphatims.bruker, we noticed a warning message stating that "No Bruker libraries are available [...], mobility and m/z values need to be estimated". For this tutorial dataset these estimations are fine, but in general we advise using Windows or Linux to avoid estimation. After importing the data on a Windows or Linux PC, it can be saved to a portable HDF file:
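The saving step might look like the sketch below. It assumes that the data object has a save_as_hdf method with the keywords shown and that this method returns the name of the written file. The helper name save_as_portable_hdf is hypothetical.

```python
def save_as_portable_hdf(data, directory, file_name):
    """Save a loaded TimsTOF object as a single HDF file."""
    # Assumption: save_as_hdf accepts these keywords and returns the file name.
    return data.save_as_hdf(
        directory=directory,
        file_name=file_name,
        overwrite=True,  # replace an existing file with the same name
    )
```

To our knowledge, the resulting file can be opened again by passing its name to alphatims.bruker.TimsTOF, in the same way as a .d folder.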

This HDF file is a single portable file containing all raw data, and it can be accessed on all operating systems regardless of how it was created. As such, the correct m/z and mobility values can be read on a Windows or Linux machine, while further processing can be done on any other OS. An additional benefit is that loading an HDF file is roughly three times faster than loading a raw Bruker .d folder.

Slicing

A TimsTOF data object can be sliced in the following five dimensions:

  1. LC: rt_values, frame_indices
  2. TIMS: mobility_values, scan_indices
  3. QUAD: quad_mz_values, precursor_indices
  4. TOF: mz_values, tof_indices
  5. DETECTOR: intensity_values

In each dimension, you can slice indices with integers and values with floats, just as you would with any normal Python object. Slicing with iterables is also supported, just as in NumPy. The result of a slicing operation is a pd.DataFrame by default. If the slicing seems confusing, remember that you can always look at the help function.

As a simple example, datapoints for the precursor with index 999 can be obtained by slicing the data in the third (i.e. QUAD) dimension. Note that the first slice can be relatively slow, because AlphaTims uses just-in-time (JIT) compilation from the Numba package; all subsequent slices should be much faster:
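In code, this slice could be written as below. The helper name precursor_datapoints is hypothetical, and data is assumed to be an already loaded TimsTOF object.

```python
def precursor_datapoints(data, precursor_index=999):
    """Return all datapoints of one precursor via the QUAD dimension."""
    # The dimensions are ordered LC, TIMS, QUAD, TOF, DETECTOR. The two
    # leading colons keep every frame and scan; the integer in the third
    # position selects a single precursor index.
    return data[:, :, precursor_index]
```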

Other slicing options include traditional Python slices or iterables. Multiple dimensions can be defined at the same time, for instance:
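The sketch below combines several dimensions in one slice. All boundaries are arbitrary illustration values rather than the ones used in the original tutorial, and the helper name slice_region is hypothetical.

```python
def slice_region(data):
    """Combine float slices, an iterable and an integer in one selection."""
    return data[
        100.0:200.0,      # LC: rt_values between 100.0 and 200.0 (floats)
        [300, 400, 500],  # TIMS: an iterable of scan_indices (integers)
        0,                # QUAD: the precursor with index 0
        500.0:600.0,      # TOF: mz_values between 500.0 and 600.0 (floats)
    ]
```

Dimensions that are left out at the end (here DETECTOR) are not filtered.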

Alternatively, a (partial) dictionary can be passed that describes the desired coordinates:
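A sketch of such a dictionary, using the coordinate names listed for the five dimensions as keys. The values are arbitrary illustration values, and the helper name slice_with_dict is hypothetical.

```python
def slice_with_dict(data):
    """Select datapoints with a partial dictionary of coordinates."""
    return data[
        {
            "frame_indices": [1, 191],            # an iterable of integers
            "scan_indices": slice(300, 800, 10),  # every tenth scan
            "mz_values": slice(None, 400.5),      # m/z below 400.5
            "intensity_values": 50,               # a single intensity value
        }
    ]
```

The dictionary is partial: the QUAD dimension is not mentioned and is therefore not filtered.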

Instead of returning a pd.DataFrame, it is also possible to return the raw indices of the ions that satisfy the filter by setting the last element of the slice to "raw":
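For instance, the raw indices of all ions detected before a given retention time could be requested as in this sketch (the helper name raw_indices_before and the default boundary are hypothetical):

```python
def raw_indices_before(data, max_rt=100.0):
    """Return raw indices of all ions with an rt value below max_rt."""
    # The trailing "raw" element switches the output from a pd.DataFrame
    # to the raw indices of the matching ions.
    return data[:max_rt, "raw"]
```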

Converting these raw indices to the appropriate coordinates or even a whole pd.DataFrame can be done with convenience functions such as:
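A sketch of both conversions, assuming that the data object offers convert_from_indices with return_* keyword flags and as_dataframe, as we understand the API (check help(data.convert_from_indices) to confirm). The helper name describe_raw_indices is hypothetical.

```python
def describe_raw_indices(data, raw_indices):
    """Convert raw indices to selected coordinates and to a full table."""
    # Assumption: each return_* flag requests one coordinate in the result.
    coordinates = data.convert_from_indices(
        raw_indices,
        return_rt_values=True,
        return_mobility_values=True,
        return_mz_values=True,
        return_intensity_values=True,
    )
    # Assumption: as_dataframe builds a pd.DataFrame for the same indices.
    dataframe = data.as_dataframe(raw_indices)
    return coordinates, dataframe
```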

Plotting

AlphaTims' plotting module provides a few convenient plots. Note that they can only be used if the plotting option is used during installation (pip install "alphatims[plotting]"). In this tutorial we will also load the holoviews package to save our (interactive) plots.

A key plot that is always useful before diving into the data is a total ion chromatogram (TIC):
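A minimal sketch, assuming that alphatims.plotting provides a tic_plot function that takes the data object and an optional title. The helper name plot_tic is hypothetical.

```python
def plot_tic(data, title="Total ion chromatogram"):
    """Create a TIC plot for a loaded TimsTOF object (hypothetical helper)."""
    # Requires the optional plotting dependencies of AlphaTims.
    import alphatims.plotting

    # Assumption: tic_plot returns an interactive plot object.
    return alphatims.plotting.tic_plot(data, title=title)
```

If the returned object is a holoviews plot, it can presumably be written to disk with holoviews.save(plot, "tic.html").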

Other useful plots include heatmaps in different dimensions. For instance, we can check the calibrant spray that is used for the TIMS:
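The sketch below draws an m/z versus mobility heatmap for a slice of the data. It assumes a heatmap function in alphatims.plotting that takes a pd.DataFrame and axis labels such as "mz" and "mobility". The selection (precursor index 0, early retention times) and the helper name plot_mz_mobility_heatmap are illustrative choices, not necessarily the ones from the original tutorial.

```python
def plot_mz_mobility_heatmap(data, max_rt=100.0):
    """Plot m/z against mobility for datapoints with rt below max_rt."""
    import alphatims.plotting

    # Slicing returns a pd.DataFrame by default, which heatmap takes as input.
    dataframe = data[:max_rt, :, 0]
    # Assumption: the axis labels are given as short coordinate names.
    return alphatims.plotting.heatmap(
        dataframe,
        x_axis_label="mz",
        y_axis_label="mobility",
    )
```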

Of course we can also investigate 1D plots such as spectra for individual precursors of interest (note that this requires "raw" indices as input and not a pd.DataFrame):
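A sketch of such a spectrum plot, assuming a line_plot function in alphatims.plotting that takes the data object, raw indices and an x-axis label. The helper name plot_precursor_spectrum and the default precursor index are hypothetical.

```python
def plot_precursor_spectrum(data, precursor_index=999):
    """Plot the spectrum of a single precursor (hypothetical helper)."""
    import alphatims.plotting

    # The trailing "raw" makes the slice return raw indices, not a DataFrame.
    raw_indices = data[:, :, precursor_index, "raw"]
    # Assumption: x_axis_label="mz" projects the selected ions onto m/z.
    return alphatims.plotting.line_plot(data, raw_indices, x_axis_label="mz")
```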

Advanced DIA example

AlphaTims was developed to access both ddaPASEF and diaPASEF. Importing a DIA sample (available for download on AlphaTims' GitHub page) is done in exactly the same way as importing a DDA sample:

With AlphaTims' slicing mechanism, it is possible to, for example, manually extract fragment traces for a precursor and overlay them in a single XIC. An example is the following peptide:

with fragments:

Seq  b#  B          Y          y#
Y    1   164.07065  973.44145  7
N    2   278.11358  810.37812  6
D    3   393.14052  696.33520  5
T    4   494.18820  581.30825  4
F    5   641.25661  480.26057  3
W    6   827.33592  333.19216  2
K    7   955.43089  147.11285  1
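The values in this table are consistent with singly charged b and y ions of the sequence YNDTFWK (read from the Seq column). The sketch below recomputes them from standard monoisotopic residue masses. It is only an illustration and not necessarily how the table was generated; the function name singly_charged_fragments is hypothetical.

```python
# Monoisotopic residue masses (Da) of the amino acids in this peptide.
RESIDUE_MASS = {
    "Y": 163.063329, "N": 114.042927, "D": 115.026943, "T": 101.047679,
    "F": 147.068414, "W": 186.079313, "K": 128.094963,
}
PROTON_MASS = 1.007276
WATER_MASS = 18.010565


def singly_charged_fragments(sequence):
    """Return {name: m/z} for all b and y ions with charge 1+."""
    fragments = {}
    for i in range(1, len(sequence) + 1):
        # b ions contain the first i residues plus one proton.
        b_mass = sum(RESIDUE_MASS[aa] for aa in sequence[:i])
        fragments[f"b{i}"] = b_mass + PROTON_MASS
        # y ions contain the last i residues plus water and one proton.
        y_mass = sum(RESIDUE_MASS[aa] for aa in sequence[-i:])
        fragments[f"y{i}"] = y_mass + WATER_MASS + PROTON_MASS
    return fragments


fragments = singly_charged_fragments("YNDTFWK")
for name in sorted(fragments):
    print(name, round(fragments[name], 4))
```

The computed values agree with the table to within about 0.0001, which reflects small differences in the mass constants used.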

First, we define the following plotting function.

NOTE: This function is not optimized and hence only defined as an example here in the tutorial instead of as an integral part of AlphaTims. As this functionality focuses more on visualization of already analysed data, we refer you to AlphaViz.
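The original function is not reproduced here. As a rough sketch of the slicing it relies on, the hypothetical helper below collects the raw indices of each fragment around a precursor's retention time and mobility. All names, tolerances and their units are assumptions; float slices in the QUAD dimension are taken to select quad_mz_values.

```python
def ppm_window(mz, ppm):
    """Return a float slice covering mz plus or minus ppm parts per million."""
    delta = mz * ppm / 1e6
    return slice(mz - delta, mz + delta)


def extract_fragment_indices(
    dia_data, precursor_mz, rt, mobility, fragments,
    ppm=50.0, rt_tolerance=30.0, mobility_tolerance=0.05,
):
    """Return {fragment name: raw indices} for one precursor (sketch)."""
    rt_slice = slice(rt - rt_tolerance, rt + rt_tolerance)
    mobility_slice = slice(
        mobility - mobility_tolerance, mobility + mobility_tolerance
    )
    # QUAD dimension: only keep windows that isolate the precursor m/z.
    quad_slice = ppm_window(precursor_mz, ppm)
    return {
        name: dia_data[
            rt_slice, mobility_slice, quad_slice, ppm_window(mz, ppm), "raw"
        ]
        for name, mz in fragments.items()
    }
```

Each set of raw indices could then be projected onto the rt dimension and overlaid to obtain XICs.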

Plotting overlaid XICs can now be done with a single command. Note that this plot is fully interactive and that, for example, individual fragments can be selected by clicking the legend.

However, timsTOF data also has a mobility dimension. It can be useful to inspect this dimension in combination with the rt dimension. This can be done by supplying the heatmap argument to the inspect_peptide function. Note again that this plot is fully interactive and that zooming and panning are done simultaneously for all subplots.